
feat: enhanced Terminal Error Handling and Alert System #797

Merged
merged 11 commits into stackblitz-labs:main on Dec 23, 2024

Conversation

thecodacus
Collaborator

@thecodacus thecodacus commented Dec 17, 2024

Enhanced Terminal Error Handling and Alert System

Overview

This PR introduces a comprehensive error handling system for terminal operations, featuring user-friendly error alerts and automatic LLM-powered error resolution. The system gracefully manages command execution lifecycles, including proper abortion of superseded commands.

Key Changes

1. Terminal Error Management

  • Added ActionCommandError class for structured error handling
  • Implemented error output capture and formatting
  • Added intelligent terminal output cleaning that preserves important information
  • Created a new alert system to display terminal errors in the UI

2. Command Execution Improvements

  • Implemented graceful command abortion when new commands are initiated
  • Added proper cleanup of aborted processes
  • Enhanced command execution state tracking
  • Improved terminal output handling and formatting

3. UI Enhancements

  • Added new ChatAlert component for displaying terminal errors
  • Integrated alert system with the main chat interface
  • Added "Fix Issue" button that automatically prompts LLM for error resolution
  • Enhanced terminal output readability with smart formatting

4. Error Resolution System

  • Added automatic LLM prompting for error resolution
  • Implemented one-click error fixing through chat interface
  • Added context-aware error message formatting
  • Enhanced error output cleaning for better LLM comprehension
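
The output-cleaning step above can be sketched roughly as follows. This is a hypothetical helper for illustration only (the function name and exact rules are assumptions, not the PR's actual implementation): strip ANSI escape sequences and progress-bar leftovers so the LLM receives plain text.

```typescript
// Hypothetical sketch of terminal-output cleaning; see the PR diff for the real logic.
function cleanTerminalOutput(raw: string): string {
  return raw
    // strip CSI escape sequences (colors, cursor movement)
    .replace(/\x1b\[[0-9;?]*[a-zA-Z]/g, '')
    // strip OSC sequences (e.g. terminal title updates), terminated by BEL or ST
    .replace(/\x1b\][^\x07]*(?:\x07|\x1b\\)/g, '')
    // drop carriage returns left behind by progress bars
    .replace(/\r/g, '')
    // collapse runs of blank lines
    .replace(/\n{3,}/g, '\n\n')
    .trim();
}
```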

Technical Details

Note: The code snippets below are high-level representations for illustration purposes only. They are simplified to demonstrate the core concepts and patterns. For actual implementations, please refer to the diff in this PR.

Error Handling System

// Simplified example - See actual implementation in action-runner.ts
class ActionCommandError extends Error {
  private readonly _header: string;
  private readonly _output: string;

  constructor(message: string, output: string) {
    const formattedMessage = `Failed To Execute Shell Command: ${message}\n\nOutput:\n${output}`;
    super(formattedMessage);
    this._header = message;
    this._output = output;
  }
}
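
For context, the runner consumes this error type along these lines. This is a minimal, self-contained sketch; the names CommandError, ActionAlert, and toAlert are illustrative assumptions, not the PR's actual API, and the real wiring lives in action-runner.ts and the workbench store:

```typescript
// Hypothetical sketch: convert a failed command into an alert payload.
// Names are illustrative, not the PR's actual API.
class CommandError extends Error {
  constructor(readonly header: string, readonly output: string) {
    super(`Failed To Execute Shell Command: ${header}\n\nOutput:\n${output}`);
  }
}

interface ActionAlert {
  type: 'error';
  title: string;
  description: string;
}

function toAlert(error: unknown): ActionAlert | undefined {
  if (error instanceof CommandError) {
    return { type: 'error', title: error.header, description: error.output };
  }
  return undefined; // unknown errors should be rethrown by the caller
}
```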

Command Lifecycle Management

// Simplified example - See actual implementation in shell.ts
async executeCommand(sessionId: string, command: string, abort?: () => void): Promise<ExecutionResult> {
  // Handle existing command abortion
  const state = this.executionState.get();
  if (state?.active && state.abort) {
    state.abort();
  }

  // Execute new command and track it so a later command can abort this one
  const executionPromise = this.getCurrentExecutionResult();
  this.executionState.set({
    sessionId,
    active: true,
    executionPrms: executionPromise,
    abort,
  });

  return executionPromise;
}

Alert System Integration

// Simplified example - See actual implementation in ChatAlert.tsx
function ChatAlert({ alert, clearAlert, postMessage }: Props) {
  return (
    <div className="rounded-lg border bg-bolt-elements-background-depth-2 p-4">
      <div className="flex items-start">
        {/* Error display */}
        <div className="ml-3 flex-1">
          <h3>{alert.title}</h3>
          <p>{alert.description}</p>
          
          {/* Action buttons */}
          <div className="mt-4">
            <button onClick={() => postMessage(`*Fix this error on terminal*`)}>
              Fix Issue
            </button>
            <button onClick={clearAlert}>
              Dismiss
            </button>
          </div>
        </div>
      </div>
    </div>
  );
}

Testing Considerations

  • Verify error capture and display across different error types
  • Test command abortion in various states
  • Validate error message formatting and clarity
  • Check LLM prompt generation for different error types
  • Verify alert system interaction with chat interface

Migration Notes

  • No breaking changes to existing terminal functionality
  • Error handling is automatically enabled for all terminal operations
  • Alert system integrates seamlessly with existing chat interface
  • No changes required for existing command execution code

Future Improvements

  • Add support for capturing UI console errors
  • Add support for different alert types beyond errors
  • Enhance error pattern recognition
  • Improve LLM prompt templates for error resolution

Preview

terminal.error.capture.demo.mov

Updates

Updated the UX of the alert:
image

@thecodacus thecodacus changed the title feat: Enhanced Terminal Error Handling and Alert System feat: enhanced Terminal Error Handling and Alert System Dec 17, 2024
@wonderwhy-er
Collaborator

Looks good, wanted something like this.
Will test now and merge if it works well.

@wonderwhy-er
Collaborator

Maybe add to the readme that this was done, too.

@wonderwhy-er
Collaborator

Hm, I did get some issues...
The red cross looks like a button to close/dismiss the dialog, but I guess it's not.

Also I got this interesting situation:
image

If I understand correctly, it captured the wrong part of the terminal, one that was actually OK, after the part of the terminal that had failed before it.

It basically ran npm install, which failed, after which it ran npm run dev, which succeeded, and that's what got captured.

I did not yet check in the code why that could happen.

Maybe I would change the messaging to "there were errors running commands in the terminal, ask AI to take a look?"
Maybe not as a whole box but as some kind of suggestion button. Make it smaller.

Just some thoughts from my side.

I am good to merge this; it's just that the part where it captures the wrong error text from the terminal makes things confusing for users who do not understand what happened.

Collaborator

@mrsimpson mrsimpson left a comment


@thecodacus first off: That's a great PR! Thanks for preparing / documenting this so extensively!

I don't only want to look at the (very helpful) video and respond "lgtm", but it will take some time as the diff is huge.
At first glance, there are a couple of aspects I want to clarify / where I'm in doubt hoping this helps:

As you nicely introduced, there are different aspects covered in this PR and I wonder whether these separate capabilities should be combined in the way that I currently understand:

ActionAlert
You called it ActionAlert and not ActionError. What's the reasoning behind that? I saw you also have type: 'error' | 'info';, but I cannot think of where type: 'info' should be used or how the workflow handling information messages should actually work. Why not limit this to ActionError?

If it was an ActionError, I wonder whether there could be other errors that could be resolved in a similar way (throwing an Error which allows the LLM to try to fix it). If this was the case, then this would be something like an ActionableError, which I'd love to see. I can e.g. imagine that exceeding an LLM's context length might be such an actionable error, with the resulting action that the workbench reduces the context.

Alerting
You are also introducing Alerting as a component for the first time. Currently, we do have a couple of places in which we propagate information about something that didn't work as expected to the user. And we don't have this in a central place. Would it make sense to have an AlertManagement component which provides state for all messages towards the user (and not have it in the workbench-state)?
From such a state (which holds an array of alerts then), ActionableErrors could be propagated to the Chat.
I'm a bit reluctant to think about introducing this here, as it might be over-engineering, but I feel that introducing error handling all over the place is... not the smartest idea with respect to regression / testability.

Tests
Speaking of tests: I'd love to have automated tests for that. It would also make sense to have real unit tests (e.g. that an ActionAlert is properly bubbled, rendered, resolved), but an integration test would be helpful as well.

But frankly: If we are in a sort-of archaic mode now, I'm also fine to just merge it 😬

@thecodacus
Collaborator Author

ActionAlert You called it ActionAlert and not ActionError. What's the reasoning behind that? I saw you also have type: 'error' | 'info';, but I cannot think of where type: 'info' should be used or how the workflow handling information messages should actually work. Why not limit this to ActionError?

I wanted the alert component to have the option to support other types of alerts for future use cases. For example, if you look at commercial Bolt, the LLM can sometimes offer solutions which are just suggestions, and the user can either take them or reject them. So I didn't want to limit the alert to errors only; I wanted it to be a generic system.

Alerting
You are also introducing Alerting as a component for the first time. Currently, we do have a couple of places in which we propagate information about something that didn't work as expected to the user. And we don't have this in a central place. Would it make sense to have an AlertManagement component which provides state for all messages towards the user (and not have it in the workbench-state)?

I have added a central place for this in workbenchStore (as this component is already holding all the other stores); you can use it anywhere to get alert notifications.

But yes, if the implementation becomes complex and we have lots of other alerts, I do think pulling it out into its own component is the best option.
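
As a sketch of what such a central alert state might look like if pulled out on its own (plain TypeScript with a tiny hand-rolled store so the example is self-contained; the real code keeps this state in workbenchStore with its existing store library, and all names here are illustrative):

```typescript
// Hypothetical minimal alert store, for illustration only.
type Listener<T> = (value: T) => void;

class Store<T> {
  private listeners = new Set<Listener<T>>();
  constructor(private value: T) {}
  get(): T { return this.value; }
  set(value: T) {
    this.value = value;
    this.listeners.forEach((l) => l(value)); // notify subscribers (e.g. the chat UI)
  }
  subscribe(listener: Listener<T>) {
    this.listeners.add(listener);
    return () => this.listeners.delete(listener); // unsubscribe handle
  }
}

interface ActionAlert { type: 'error'; title: string; description: string; }

// One "latest alert" slot, matching the single-alert scope discussed in this PR.
const actionAlert = new Store<ActionAlert | undefined>(undefined);
```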

If it was an ActionError, I think about whether there could be other errors that could be resolved in a similar way (throwing an Error which allows the LLM to try to fix it). If this was the case, then this would be something like an ActionableError which I'd love to see. I can e. g. imagine that exceeding a context length for an LLM might be such an actionable error, with the resulting action that the workbench reduces the context.

Yes, I like this idea of an ActionableError, which can be the parent class of all errors the LLM can fix.

@thecodacus
Collaborator Author

thecodacus commented Dec 18, 2024

The red cross looks like a button to close/dismiss the dialog, but I guess it's not.

Got it, will try to change that to something else.

It basically ran npm install, which failed, after which it ran npm run dev, which succeeded, and that's what got captured.

I did not yet check in the code why that could happen.

Got it, so this is an issue of capturing the exit code of previous commands, which we were discussing on another PR.
I have a fix for that issue in #789; these two can work well together.
I wanted these to be in the same PR, but I had already raised that one before starting this one.

Maybe I would change the messaging to "there were errors running commands in the terminal, ask AI to take a look?"

Yes, I am not very good at writing lines 😅; I will update the wording.

Maybe not as a whole box but as some kind of suggestion button. Make it smaller.

I saw how bolt.new alerts the user on errors, so this UX immediately came to mind. I am up for a better UX 👍

@mrsimpson
Collaborator

The LLM can suggest some solutions sometimes which are just suggestions, and the user can either take them or reject them.

Yes, that's what I guessed you wanted to achieve. However, I think a suggestion is different from surfacing an error. It might be presented differently, and there may be multiple at the same time.

=> Would you be fine with limiting this PR to errors? Once we come to suggestions, we can revisit that either way.

@thecodacus
Collaborator Author

thecodacus commented Dec 18, 2024

I am good to merge this; it's just that the part where it captures the wrong error text from the terminal makes things confusing for users who do not understand what happened.

@wonderwhy-er Yes, makes sense. Can you provide me the chat export so I can do some testing on that?

Would you be fine with limiting this PR to errors? Once we come to suggestions, we can revisit that either way.

@mrsimpson Sure, I will just update the chat alert component and the interface.

@wonderwhy-er
Collaborator

Can you provide me the chat export so I can do some testing on that?

Weird, but that chat is gone... I needed to save it right away.

And I can't reproduce it, so never mind.

Also, the current state of reloads makes me wonder whether, when loading a chat, we can wait to run commands until the end and run each one once, instead of rebuilding multiple times.

Something to test.

@wonderwhy-er
Collaborator

And on reload I do get error messages again for things that were already fixed in later messages.
That's also problematic and confusing for those who do not know what is happening.

@thecodacus
Collaborator Author

And on reload I do get error messages again for things that were already fixed in later messages. That's also problematic and confusing for those who do not know what is happening.

Yeah, I was thinking about that; figuring out a way to fix it.

@faddy19

faddy19 commented Dec 20, 2024

I was not able to capture these errors and feed them back to the LLM. There are not only terminal errors but also errors that only appear when rendered in the browser. We might also include these in the fix-errors component. Let me know what you think.

image

@faddy19

faddy19 commented Dec 20, 2024

We might have to think of ways to capture any error in the browser console and feed it back. There will always be errors.

@thecodacus
Collaborator Author

Yes, that will need some brainstorming. I have added it to the PR description as a future improvement.

@faddy19

faddy19 commented Dec 20, 2024

When the error logs from the browser console are copy-pasted into the LLM, 99% of all errors get resolved and the preview works again. I spent around 3 days trying to log these errors from the browser's console but was unable to.

Once we get these errors into our event logs, we can feed them into the chat LLM and it heals the errors.

We have to collect all errors from the terminal and from the browser console. We will feed them back to the LLM, and 100% will get solved, and we won't be stuck with a blank preview or stopped file creation.

The most challenging part is getting the browser console logs.

Any ideas? I tried nearly everything.
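
One common starting point (a hedged sketch, not something this PR implements; the hard part discussed here is getting such a hook into the preview at all) is to normalize browser error events into plain payloads that the preview page could postMessage to the parent app. The PreviewError shape and function name below are assumptions for illustration:

```typescript
// Hypothetical payload shape for forwarding preview errors to the parent app.
interface PreviewError {
  kind: 'preview-error';
  message: string;
  source?: string;
  line?: number;
}

// Normalize the fields of a browser ErrorEvent-like object into a plain payload.
function toPreviewError(e: { message: string; filename?: string; lineno?: number }): PreviewError {
  return { kind: 'preview-error', message: e.message, source: e.filename, line: e.lineno };
}

// In the preview page this would be wired up roughly as:
// window.addEventListener('error', (e) => window.parent.postMessage(toPreviewError(e), '*'));
```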

@wonderwhy-er
Collaborator

When the error logs from the browser console are copy-pasted into the LLM, 99% of all errors get resolved and the preview works again. I spent around 3 days trying to log these errors from the browser's console but was unable to.

Once we get these errors into our event logs, we can feed them into the chat LLM and it heals the errors.

We have to collect all errors from the terminal and from the browser console. We will feed them back to the LLM, and 100% will get solved, and we won't be stuck with a blank preview or stopped file creation.

The most challenging part is getting the browser console logs.

Any ideas? I tried nearly everything.

Let's explore that in a separate PR.
I agree with you that JavaScript errors are the next thing after terminal errors.
I do have some ideas but need to explore.

@faddy19

faddy19 commented Dec 20, 2024

image

@faddy19

faddy19 commented Dec 20, 2024

Yes, absolutely, we can discuss that in a different PR.

It is not just JavaScript errors; we have to collect, group, and feed the browser console errors into the LLM. These are connection errors, JavaScript errors, broken-code errors, basically any error.

The challenge is how we get them exactly as shown in the browser console.

Please create a PR and let's continue there. Thanks.

@faddy19

faddy19 commented Dec 20, 2024

@wonderwhy-er Let me know so we can advance on these errors. I guess this is a core blocker for bolt.diy, as I get this quite often. Maybe we can get it fixed in a couple of days if we put our brains together.

Collaborator

@dustinwloring1988 dustinwloring1988 left a comment


Worked great on Windows 11.

@wonderwhy-er
Collaborator

Here I checked how to get errors.
Early exploration; it needs to be connected to @thecodacus's flow:
#856

@thecodacus
Collaborator Author

Here I checked how to get errors. Early exploration; it needs to be connected to @thecodacus's flow: #856

I am figuring out how not to alert when reloading; once that's done, we can merge both.

@wonderwhy-er
Collaborator

Here I checked how to get errors. Early exploration; it needs to be connected to @thecodacus's flow: #856

I am figuring out how not to alert when reloading; once that's done, we can merge both.

I think the only way to do it is to connect errors to messages, considering that those errors come from commands in messages.

This also kind of connects to previous discussions about snapshots.

Maybe we should not run all terminal commands on reload.
Run only file commands while aggregating terminal commands, and then run them once at the end, after all file commands have finished?

@thecodacus
Collaborator Author

Maybe we should not run all terminal commands on reload. Run only file commands while aggregating terminal commands, and then run them once at the end, after all file commands have finished?

That might work, but for that we also need a flag to know that these commands are coming from reloaded messages. And if that's the case, I can directly use that flag to not throw any error.

The problem is how to set that flag correctly.

This also kind of connects to previous discussions about snapshots.

I can try updating that PR to adapt the file load, detect package.json, and add the npm commands that we are doing for git imports. That might solve this issue, and it will save a lot of tokens that were used to rewrite the same file multiple times.

@wonderwhy-er
Collaborator

I don't fully understand this part:
"I can try updating that PR to adapt the file load, detect package.json, and add the npm commands that we are doing for git imports. That might solve this issue, and it will save a lot of tokens that were used to rewrite the same file multiple times."

It also feels like overkill for this PR; I would do it in a separate one.

In this one I would connect errors to messages. Sadly I have a busy day; maybe I will have time to look at it tomorrow.

@thecodacus
Collaborator Author

thecodacus commented Dec 21, 2024

I don't fully understand this part: "I can try updating that PR to adapt the file load, detect package.json, and add the npm commands that we are doing for git imports. That might solve this issue, and it will save a lot of tokens that were used to rewrite the same file multiple times."

It also feels like overkill for this PR; I would do it in a separate one.

What I meant is that that PR takes a snapshot and uses it on reload, which automatically saves a lot of tokens, as the previous chat history is not loaded and sent with subsequent LLM calls.

I just need to modify it to also run the npm install and npm run dev commands after reload.

@wonderwhy-er
Collaborator

wonderwhy-er commented Dec 21, 2024

I don't fully understand this part: "I can try updating that PR to adapt the file load, detect package.json, and add the npm commands that we are doing for git imports. That might solve this issue, and it will save a lot of tokens that were used to rewrite the same file multiple times."
It also feels like overkill for this PR; I would do it in a separate one.

What I meant is that that PR takes a snapshot and uses it on reload, which automatically saves a lot of tokens, as the previous chat history is not loaded and sent with subsequent LLM calls.

I just need to modify it to also run the npm install and npm run dev commands after reload.

Ah, you are trying too many things at once again, like killing five birds with one stone :D

You mixed "reload" and "context management".

I want the previous chat history available for future calls; maybe not the files, but the text requests and responses, yes.

@wonderwhy-er
Collaborator

Just for example, this is how bolt.new messages look today:
https://gist.github.com/wonderwhy-er/e92bdc732fc65b8261296672f6c13a0b

They contain the full conversation including files. Excluding files will make the token counts smaller, but it will also make the model more confused about what it's doing.

We can add what you propose as an experimental mode that can be switched off and on, to compare how well the model performs.

@wonderwhy-er
Collaborator

But also, it feels like the feature in this PR should work irrespective of whether snapshots and context management are implemented and used or not.

@thecodacus
Collaborator Author

But also, it feels like the feature in this PR should work irrespective of whether snapshots and context management are implemented and used or not.

Yes, context management definitely isn't needed for this. A snapshot would have been great, and it would have worked without any of these challenges. Let me see how I can set a reloading flag.

@thecodacus
Collaborator Author

@wonderwhy-er Done, please check.

@wonderwhy-er
Collaborator

Something still feels weird about this:
https://gist.github.com/wonderwhy-er/caa6f243891c2914e7e396c1381db620

Here I have a chat that still fails.

But it has the issue I was trying to reproduce, where the AI generated code in a '''json''' markdown code-block style.
Wanted to fix that.
Will do now.

@thecodacus
Collaborator Author

thecodacus commented Dec 21, 2024

Something still feels weird about this: https://gist.github.com/wonderwhy-er/caa6f243891c2914e7e396c1381db620

Here I have a chat that still fails.

But it has the issue I was trying to reproduce, where the AI generated code in a '''json''' markdown code-block style. Wanted to fix that. Will do now.

I saw the chat, but this is a case where the LLM fails to resolve the issue even though it understood the issue and pointed it out correctly.

@thecodacus
Collaborator Author

Something still feels weird about this
https://gist.github.com/wonderwhy-er/caa6f243891c2914e7e396c1381db620

Are you talking about the case of multiple errors, where install failed and start also failed? I have scoped it to only show an alert for the latest error, but we can queue the errors if you like.

But I still think a single latest-error alert is good enough.

@wonderwhy-er
Collaborator

wonderwhy-er commented Dec 23, 2024

I want to show errors along the commands we run:
image

So that we see them inline in the chat as possible things to fix, not as a global toast above the chat.

What do you think?

But we can do it in a separate request if you want.

You did plenty of work here already.

I mean, have you seen how Windsurf does it? It runs commands in the chat and shows errors right beside them, and proposes to fix them as part of the chat.
Not as out-of-chat toasts.
The one you did feels out of context in that sense.

@thecodacus
Collaborator Author

I want to show errors along the commands we run.

So that we see them inline in the chat as possible things to fix, not as a global toast above the chat.

What do you think?

But we can do it in a separate request if you want.

You did plenty of work here already.

I mean, have you seen how Windsurf does it? It runs commands in the chat and shows errors right beside them, and proposes to fix them as part of the chat. Not as out-of-chat toasts. The one you did feels out of context in that sense.

This sounds good, but there are lots of open questions on the execution side.
What happens to the state of the command that failed once we ask the AI to fix it?

Shall we remove the error popup from that command if the fix worked? And how do we define whether it's fixed if all it did was a file update?

I think it's best to finish this PR like this and discuss enhancements separately. This is how bolt.new does the error popup, so people might already be familiar with it. Once we brainstorm something better, we will just update the implementation.

@faddy19

faddy19 commented Dec 23, 2024

Let's not overcomplicate this and jump 10 steps ahead. First let's fix it as it is. If it works, we make the next enhancement. Let's go step by step. The next step would be like Windsurf: running the terminal in the chat stream and fixing it directly.

@wonderwhy-er wonderwhy-er merged commit 79e7e75 into stackblitz-labs:main Dec 23, 2024
3 checks passed
@wonderwhy-er
Collaborator

Good point about bolt already working like that.
I checked just now: they add the error to the chat and based on that don't show it again, though if it still fails afterwards they show the errors as new ones.

Tested, approved and merged

@faddy19

faddy19 commented Dec 23, 2024

Good point about bolt already working like that.

I checked just now: they add the error to the chat and based on that don't show it again, though if it still fails afterwards they show the errors as new ones.

Tested, approved and merged

Well done @wonderwhy-er

@faddy19

faddy19 commented Dec 23, 2024

Let's push the WebContainer errors as well. Will try to contribute and test.

@thecodacus thecodacus added this to the v0.0.4 milestone Dec 25, 2024